Data Collection Rituals

SDS 237: Final Project Checkpoint

Part 1: Detail a Data Collection Practice

Determine a stakeholder that is predominantly responsible for data collection in your dataset. Find a YouTube video of this stakeholder at work or an instruction manual detailing the steps they should take to collect the data. In 200-300 words describe the rituals of the data collection, paying close attention to details. Questions you might consider include: What standards (if any) must the data collectors follow when they encounter the environment on which they are collecting data? What tools do they use? What do they wear? During what time periods do they collect the data? Where do they go to collect the data? Are they alone or accompanied by others? How do they engage their senses when collecting the data?

Part 2: Represent the Data Collection as a Flow Chart

Install the DiagrammeR package in your console with the following code:

install.packages(“DiagrammeR”)

Run the code below to get a sense of the final chart we will be producing. In the code below, add new tabs for every node you’d like represented in your workflow, and replace the label text with the text you’d like to appear in that node. Under the comment ‘#set node layout’, list how tabs should point to one another. If you’d like to customize this further, you can refer here: https://mikeyharper.uk/flowcharts-in-r-using-diagrammer/

library(DiagrammeR)
grViz(diagram = "digraph flowchart {
      node [fontname = Arial, shape = oval, color = Blue, style = filled, fontcolor = White]        
      tab1 [label = 'Step 1']
      tab2 [label = 'Step 2']
      tab3 [label = 'Step 3, part 1']
      tab4 [label = 'Step 3, part 2']
      tab5 [label = 'Step 4']
# set up node layout
      tab1 -> tab2;
      tab2 -> tab3;
      tab2 -> tab4;
      tab3 -> tab5;
      tab4 -> tab5
      }
      ")

Part 3: Data Commensurability

What changes to data environments might make the data collected in one time period or setting incommensurable with data collected in another time period or setting? What mechanisms, if any, have been put in place to mitigate these effects? How successful are these mechanisms in promoting comparability within the data? Write-up your response in 200-300 words.